HUMAN GROWTH HORMONE VARIANTS

The present invention is based upon the discovery of
variant proteins of human growth hormone and upon the
determination of the DNA sequence and deduced amino acid
sequence thereof via recombinant DNA technology. It
further provides the means and methods for producing
useful amounts of such variant proteins by expression of
DNA in a host microorganism or cell culture.

Means and methods for the microbial production of
numerous polypeptides, including human growth hormone,
were disclosed by Goeddel et al. in U.S. S.N. 55126, filed
July 5, 1979, which is hereby incorporated by reference.
Counterparts of the application have been published, e.g.
British patent application publication no. 2055382 and
European patent application publication no. 22242.

Methods for producing various heterologous polypeptides
from microorganism hosts have required the identification
of the DNA sequence encoding the particular desired
product. Once known, the sequence could be fashioned,
synthetically or from tissue-derived cDNA, and operably
inserted into expression vectors that were then used to
transform the host organism and thus direct its
production of the polypeptide. In the past, relatively
small proteins could be produced by synthesizing the
entire gene. The first example of this was human insulin
(see British patent application publication no. 2007676,
for example). For polypeptides too large to admit of
microbial expression from entirely synthetic genes, cDNA
was obtained from mRNA transcripts derived from tissue.

In each case, knowledge of the amino acid sequence
was essential so that the correct sequential synthesis
could be conducted in the one case and so that correct
homologous sequence probes for isolating appropriate mRNA
sequences could be used in the other.

Problems are encountered where the sequence of the
polypeptide is unknown and the source of the polypeptide
contains insufficient amounts to ensure extraction of
enough material for sequencing. Thus, hitherto employed
methods fail where the gene is expressed in unknown
tissue, in undetectable amounts, or in very few cells of a
tissue, or for other reasons related to limitations
imposed by the current state of the art concerning the
isolation of rare mRNA.

It was perceived that refinements in
recombinant DNA techniques would be useful in providing
desired heterologous protein under such circumstances.

The present invention is based upon the discovery that
recombinant DNA technology can be used to advantage in
isolating genes in sufficient amounts to permit
sequencing where, in the native state, the products of
such genes are either not produced in identifiable amounts
or are produced in amounts insufficient to permit their
useful isolation.

This invention provides a process involving probing the
native gene bank, derived from genomic DNA, to obtain
genomic sections containing the desired gene together with
any naturally occurring intron sequences associated with it.
These sections are incorporated into host cells such that
transcription faithfully produces corresponding mRNA
transcripts. Such transcripts are devoid of sequence(s)
corresponding to any intron sequence that may have been
present in the original gene, these being spliced out as
part of the post-transcription processes. From the mRNA
transcript a cDNA bank is prepared from which the cDNA
encoding the desired protein product is isolated and
thereafter operably inserted into an expression vector via
procedures known per se. Unique is the method of producing
desired mRNA, and thence cDNA, for expression vector use,
from total genomic DNA, without the benefit of hindsight
application of DNA sequence knowledge.

This is a method of general application by which mRNA, and
thence cDNA, can be obtained from a portion of
chromosomal DNA containing a gene of human, mammalian,
or other eucaryotic origin. The method is useful in those
cases where the gene, but not the mRNA derived from this
gene, can be isolated. This occurs because the product
of gene expression is not detectable, for reasons set
forth supra. A small segment of chromosomal DNA
carrying the desired gene can be isolated from a genomic
library employing a suitable DNA hybridization probe.
This DNA is then introduced into the nuclei of suitable
tissue culture cells by means of any one of a number of
existing techniques that guarantee the efficient expression
of the gene. The gene is transcribed, the primary
transcription product is processed by removal of intron
sequences, capping and polyadenylation, and the final
transcription product, the mRNA, appears in the
cellular cytoplasm as a template for protein synthesis.
Thus, a cDNA bank derived from the polyadenylated
mRNA from such cells will contain cloned cDNA derived
from the gene of interest, and this cDNA can then be
processed for bacterial expression by standard procedures.
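
By way of illustration only, the relationship between the
primary transcript, the spliced mRNA and the cDNA derived
from it may be sketched in Python as follows. The toy
sequence, the intron coordinates and the function names are
assumptions made for this sketch and form no part of the
work described; the sketch models only the removal of
introns and the complementarity of first-strand cDNA.

```python
# Illustrative sketch only: the toy sequence and coordinates are assumptions.

def splice(primary_transcript: str, introns: list[tuple[int, int]]) -> str:
    """Join exons by removing introns given as 0-based, half-open (start, end) spans."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(primary_transcript[position:start])
        position = end
    exons.append(primary_transcript[position:])
    return "".join(exons)


# RNA base -> complementary DNA base
_RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")


def first_strand_cdna(mrna: str) -> str:
    """Return the first-strand cDNA (written 5'->3') complementary to an mRNA."""
    return mrna.translate(_RNA_TO_CDNA)[::-1]


# exon + intron (GU...AG) + exon
toy_transcript = "AUGGCC" + "GUAAGUUUUCAG" + "GAAUAA"
toy_mrna = splice(toy_transcript, [(6, 18)])
toy_cdna = first_strand_cdna(toy_mrna)
print(toy_mrna)  # AUGGCCGAAUAA
print(toy_cdna)  # TTATTCGGCCAT
```

The cDNA so obtained carries no intron sequence, which is
the property relied upon when the cDNA is subsequently
expressed in a bacterial host.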

Several methods are currently known for the introduction
of foreign genes into the nuclei of cells (microinjection,
cotransformation using genetic markers, and eucaryotic
vectors derived from the genomes of animal viruses). Only
a few result in the efficient expression of introduced
genes. One system with such properties is the COS cell,
which carries the SV40 T antigen gene in its chromosomes
such that DNA containing the SV40 origin of replication
introduced into the cells will replicate in high copy
number. Efficient expression of DNA physically linked
to this origin is made possible by utilizing the SV40
late promoter function.

This aspect of the present invention can be illustrated 
by a particular scheme as follows: 

GENE BANK
    |  SPECIFIC PROBE
    v
CHROMOSOMAL GENE
    |  CLONE INTO SHUTTLE VECTOR
    v  FOR COS CELLS AND E. COLI
GENE EXPRESSION VECTOR FOR COS CELLS
    |  TRANSFECT COS CELLS
    |  ISOLATE mRNA FROM COS CELLS
    v  CLONE cDNA
cDNA BANK
    |  SPECIFIC PROBE
    v
CLONED cDNA DERIVED FROM CHROMOSOMAL GENE

The present invention is further based upon the
discovery of variants of human growth hormone. In
particular, a representative gene coding for an HGH
variant protein and containing four intervening
sequences has been isolated from a human genomic library.
No tissue is known which produces this protein; in fact,
the existence of the HGH variant (HGH-V) protein has
never been described. To produce the protein by
recombinant DNA technology it is essential to obtain a
cloned cDNA copy of the mRNA of this gene. Using the
method described above, a tissue culture system is
employed to produce the desired mRNA (the starting
material for cDNA synthesis). The gene is cloned into
a bacterial vector (plasmid pML) containing the SV40
origin of replication and late promoter function.
In this construct, the HGH-V gene is linked to the SV40
promoter to ensure transcription of the correct strand.
After sufficient amounts have been obtained from a
bacterial culture, this DNA is used to transfect COS cells.
The expression of the HGH-V gene is monitored in transfected
cells by RNA analysis and RIA. Approximately five days
after transfection, cells are harvested, RNA is extracted and
polyadenylated RNA is prepared. Double-stranded cDNA is
synthesized and a cloned bank of transfected COS cell
cDNA is established by standard procedures. This bank
is probed with an appropriate nucleic acid probe to
detect colonies carrying cloned cDNA derived from HGH variant
mRNA. DNA sequence analysis is used to determine
whether the cloned HGH-V cDNA is devoid of intron sequences,
indicating correct splicing of the primary gene transcript
in COS cells. Standard technology is used to construct
a bacterial expression vector for the HGH variant
protein starting with the cloned cDNA.

This invention is directed to the method of isolating mRNA
and to HGH variants in all of their respective aspects, and
is not to be construed as limited to any specific details
described herein and embraced within the general compass of
this invention.

Being variants of human growth hormone (HGH), which is
known to be useful, inter alia, for the treatment of
hypopituitary dwarfism, the products hereof are
doubtless implicated in the general anabolic and other
metabolic activities of the processes involving HGH
itself. Thus, they would be useful in evoking activities
within the sphere of general anabolic and other metabolic
activities ascribed to HGH and may prove to have divergent
activities outside of an overlapping set common with HGH.



Description of Preferred Embodiments

A. Microorganisms/Cell Cultures

1. Bacterial Strains/Promoters

The work described herein was performed employing the
microorganism E. coli K-12 strain 294 (end A, thi-, hsr-,
khsm+). This strain has been deposited with the
American Type Culture Collection, ATCC Accession No.
31446. However, various other microbial strains are
useful, including known E. coli strains such as E. coli B,
E. coli X 1776 (ATCC No. 31537) or other microbial strains,
many of which are deposited and (potentially) available
from recognized microorganism depository institutions,
such as the American Type Culture Collection (ATCC); cf.
the ATCC catalogue listing. These other microorganisms
include, for example, Bacilli such as Bacillus subtilis
and other enterobacteriaceae, among which can be mentioned
as examples Salmonella typhimurium and Serratia marcescens,
utilizing plasmids that can replicate and express
heterologous gene sequences therein.

As examples, the beta lactamase and lactose promoter
systems have been advantageously used to initiate and
sustain microbial production of heterologous polypeptides.
More recently, a system based upon the tryptophan operon,
the so-called trp promoter system, has been developed.
Numerous other microbial promoters have been discovered
and utilized, and details concerning their nucleotide
sequences, enabling a skilled worker to ligate them
functionally within plasmid vectors, are known.

2. Yeast Strains/Yeast Promoters

The expression system hereof may also employ a plasmid
which is capable of selection and replication in both
E. coli and the yeast, Saccharomyces cerevisiae. One
useful strain is strain RH218 deposited at the American
Type Culture Collection without restriction (ATCC No.
44076). However, it will be understood that any
Saccharomyces cerevisiae strain can be employed.

When placed on the 5' side of a non-yeast gene, the
5'-flanking DNA sequence (promoter) from a yeast gene can
promote the expression of a foreign gene in yeast when
placed in a plasmid used to transform yeast. Besides
a promoter, proper expression of a non-yeast gene in
yeast requires a second yeast sequence placed at the
3'-end of the non-yeast gene on the plasmid so as to
allow for proper transcription termination and
polyadenylation in yeast. This promoter can be suitably
employed in the present invention as well as others; see
infra.

Because yeast 5'-flanking sequence (in conjunction with
3' yeast termination DNA) (infra) can function to promote
expression of foreign genes in yeast, it seems likely that
the 5'-flanking sequences of any highly expressed yeast
gene could be used for the expression of important gene
products. Any of the 3'-flanking sequences of these
genes could also be used for proper termination and
mRNA polyadenylation in such an expression system.

Many yeast promoters also contain transcriptional control
so that they may be turned off or on by variation in growth
conditions. Some examples of such yeast promoters are those
of the genes that produce the following proteins: alcohol
dehydrogenase II, isocytochrome-c, acid phosphatase,
degradative enzymes associated with nitrogen metabolism,
glyceraldehyde-3-phosphate dehydrogenase, and enzymes
responsible for maltose and galactose utilization.
Such a control region would be very useful in controlling
expression of protein products, especially when their
production is toxic to yeast. It should also be
possible to combine the control region of one 5'-flanking
sequence with a 5'-flanking sequence containing a promoter
from a highly expressed gene. This would result in a
hybrid promoter and should be possible since the control
region and the promoter appear to be physically distinct
DNA sequences.

3. Cell Culture Systems/Cell Culture Vectors

Propagation of vertebrate cells in culture (tissue culture) 
has become a regular procedure in recent years. The 
COS-7 line of monkey kidney fibroblasts may be employed 
as the host for the production of animal interferons (1). 
However, the experiments detailed here could be performed 
in any cell line which is capable of the replication and 
expression of a compatible vector, e.g., WI38, BHK, 3T3, 
CHO, VERO, and HeLa cell lines. Additionally, what is 
required of the expression vector is an origin of replication 
and a promoter located in front of the gene to be expressed, 
along with any necessary ribosome binding sites, RNA splice 
sites, polyadenylation site, and transcriptional terminator 
sequences. While these essential elements of SV40 have 
been exploited herein, it will be understood that the 
invention, although described herein in terms of a 
preferred embodiment, should not be construed as limited 
to these sequences. For example, the origin of 
replication of other viral (e.g., Polyoma, Adeno, VSV,
BPV, and so forth) vectors could be used, as well as
cellular origins of DNA replication which could 
function in a nonintegrated state. 

B. Vector Systems 

A useful vector to obtain expression consists of pBR322
sequences which provide a selectable marker for selection
in E. coli (ampicillin resistance) as well as an E. coli
origin of DNA replication. These sequences are derived
from the plasmid pML-1 (2) and encompass the region
spanning the Eco RI and Bam HI restriction sites. The SV40
origin is derived from a 342 base pair Pvu II-Hind III
fragment encompassing this region (both ends being
converted to Eco RI ends). These sequences, in addition
to comprising the viral origin of DNA replication, encode
the promoter for both the early and late transcriptional
unit. The orientation of the SV40 origin region is such
that the promoter for the late transcriptional unit is
positioned proximal to the gene encoding interferon.

Description of the Drawings 

Figure 1 depicts restriction maps and functional
organization of three human genomic DNA fragments
containing members of the HGH gene family.

Figure 2 shows the nucleotide sequences of two HGH genes
and one HCS gene, with flanking regions. Nucleotide
numbers refer to the HGH-N gene sequences, the first digit
of any number appearing above the corresponding
nucleotide. Positively numbered nucleotides start at
the presumed cap site, and negatively numbered
nucleotides are assigned to 5'-flanking sequences.
Negatively numbered triplets code for the respective
signal peptides; positive numbers refer to the codons
for the mature peptides. The TATAAA and AATAAA sequences
diagnostic of eucaryotic structural genes are underlined.
The human Alu family sequences in the 3'-flanking region of
the two HGH genes are shown by lines above the nucleotides.
The short region in the HCS gene homologous to one end of
the Alu family sequences in the HGH genes is underlined.

Figure 3 illustrates one HGH variant amino acid and
nucleotide sequence. The primary structure of the protein
was derived solely from the coding portion of the exons in
the HGH-V gene (see Figure 2). Differences in nucleotide
sequence to the HGH-N gene and to the HGH protein sequence
(3) are indicated below the variant sequences. Amino
acid residue 9 in HGH can be proline or leucine as
determined by cDNA sequencing (4).

Figure 4 illustrates a cell culture expression vector
construction hereof, namely pSV-HGH-V, harboring the gene
encoding an HGH variant protein hereof.



Gene isolation and characterization

A human genomic library in bacteriophage λ (5) was
screened for members of the human HGH gene family by
in situ plaque hybridization (6) with cloned HGH cDNA
sequences (4,7). The hybridizing Eco RI fragment in each
isolate was either 2.6, 2.9, or 9.5 kb in length, as
expected from a similar analysis of human genomic DNA (8).
Hybridizing fragments were subcloned into the EcoRI site of
plasmid pBR325 (9) and three (2.6, 2.6, and 2.9 kb) were
chosen for a complete DNA sequence determination. The two
smaller fragments originated from different phage isolates
and had distinct restriction maps.

For sequence analysis the three genomic DNA fragments were
excised from the plasmid DNA and isolated by polyacrylamide
gel electrophoresis. Restriction maps for several
specific endonucleases were obtained by standard methods
and are shown in Figure 1. Overlapping segments
corresponding to defined restriction fragments were
inserted into phage M13mp7 RF-DNA (10) and single-stranded
recombinant phage DNAs were used as templates in enzymatic
sequencing reactions (11) using a synthetic oligonucleotide
as universal primer (10).



The complete nucleotide sequences of the three genomic
fragments, aligned to one another and segmented into
exons, introns, and nontranscribed (flanking) regions, are
shown in Figure 2. Segmentation was achieved by comparison
to the primary structures of HGH and HCS cDNA, and
facilitated by the close homology of the three sequences.
Whereas the entire nucleotide sequence of HGH mRNA (except
for part of the 5'-untranslated region) is known (4,7), only
a large part of the sequence of HCS mRNA has been reported
(12,13).

The following description provides further detail
enabling the practice of the invention.

Approximately 10^ recombinant λ phage carrying human
chromosomal DNA were plated onto twenty 150 mm petri dishes
and screened (6) with radioactively labeled cloned HGH
cDNA (7) (specific activity >10^ cpm/µg, approximately
10^ cpm per filter). The preparation of this probe
was as described (8). Twelve phages were isolated,
subjected to 2 rounds of screening for phage purification,
grown in E. coli strain DP50supF (ATCC No. 39061,
deposited March 5, 1982) and prepared from lysed
cultures as described (14). Aliquots (1 µg) from each
phage DNA were digested with endo Eco RI and Southern
blots (15) of the digests were hybridized with the cloned
HGH cDNA probe. Hybridizing fragments were either
9.5, 2.9 or 2.6 kb in size and were subcloned into the
EcoRI site of plasmid pBR325 (9) using the same HGH probe
in colony hybridizations (16) on chloramphenicol
sensitive clones. Three subcloned Eco RI fragments 2.6,
2.6, and 2.9 kb long and containing two HGH genes and one
HCS gene were subjected to a crude restriction analysis.
Plasmid DNA was extracted from 1 l cultures amplified
with chloramphenicol (200 µg/ml) during the log phase of
growth. DNA extraction and purification by a cleared
lysate technique was essentially as described (17). RNA
was removed by digestion with RNAase (10 µg/ml) and
chromatography on agarose A50m. Plasmid DNAs were cut to
completion with endo Eco RI and cloned DNA fragments
isolated by excision from 6 percent polyacrylamide gels.
To obtain maps, gel-isolated DNA fragments were cleaved
with one or more of the following restriction endonucleases
(supplier BRL): BamHI, BglII, PvuII, PstI, SmaI, and XbaI.
Typically, reactions were in 20 µl of 10 mM Tris.HCl,
pH 7.5, 0.1 mM EDTA, 7 mM MgCl2, 10 mM DTT, containing
500 ng of DNA and 2 U of enzyme, and were incubated for
1 hr. at 37°C. Digests were separated on 6 or 8 percent
polyacrylamide gels with parallel runs of plasmid pBR322
DNA cut with endo Hinf or Hae III for size markers. DNA
in gels was visualized by ethidium bromide staining and
UV light.

The exact locations of the enzyme recognition sites shown
above were obtained from the final DNA sequences (see
Figure 2). The location of exons was determined from
comparisons to cloned cDNA sequences as described in the
text. The stippled boxes in the 3'-flanking regions of
two HGH genes indicate the location of members of the
human Alu family. Arrows show the strategy used for
sequencing more than 95 percent of the three genomic DNA
fragments. Overlaps were generated and sequences
confirmed by sequencing selected gel-isolated restriction
fragments.
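
Once a final DNA sequence is in hand, recognition sites for
the mapping enzymes can be located by simple pattern
search. The following Python sketch is offered only as an
illustration of that step; the recognition sequences are
the standard ones for the six enzymes named, while the
function name and the toy sequence are assumptions and are
not taken from the genomic fragments themselves.

```python
import re

# Standard recognition sequences of the six mapping enzymes.
RECOGNITION_SITES = {
    "BamHI": "GGATCC",
    "BglII": "AGATCT",
    "PvuII": "CAGCTG",
    "PstI": "CTGCAG",
    "SmaI": "CCCGGG",
    "XbaI": "TCTAGA",
}


def find_sites(sequence: str, enzymes: dict[str, str] = RECOGNITION_SITES) -> dict[str, list[int]]:
    """Return 0-based start positions of each recognition site in the sequence."""
    sequence = sequence.upper()
    return {
        name: [match.start() for match in re.finditer(f"(?={site})", sequence)]
        for name, site in enzymes.items()
    }


# Hypothetical test sequence carrying one BamHI, one PstI and one SmaI site.
toy_sequence = "AAGGATCCTTCTGCAGAACCCGGGTT"
site_map = find_sites(toy_sequence)
print(site_map["BamHI"], site_map["PstI"], site_map["SmaI"])  # [2] [10] [18]
```

Positions are reported for the first base of each site
rather than for the cut position, since only the location
of the recognition sequence is needed for a map of this
kind.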

The sequencing strategy indicated by the arrows in Figure
1 yielded more than 95 percent of the DNA sequences shown
above. Necessary overlaps were generated and unclear
gel data resolved by sequencing selected gel-isolated
sections. Plasmid-cloned and gel-isolated genomic
Eco RI fragments (300 ng) were cleaved with one or two
of the restriction endonucleases employed for mapping
(see also arrows in Figure 1). Reaction conditions were
as described in the legend to Figure 1, except that
deoxyribonucleoside triphosphates (50 µM each) and E. coli
DNA polymerase I large fragment (0.5 U, Boehringer
Mannheim) were added to generate blunt-ended DNA pieces
for convenient insertion into a phage vector. Digested
DNA was extracted with phenol/chloroform, precipitated with
ethanol and ligated to HincII-cleaved phage M13mp7 RF-DNA
(10). Ligation reactions (20 µl) contained 20 ng RF-DNA,
100 ng digested genomic DNA fragment and 2 U DNA ligase
(New England Biolabs) in 50 mM Tris.HCl, pH 8.0, 0.1 mM
EDTA, 10 mM MgCl2, 10 mM DTT, and 500 µM rATP. Reactions
were incubated at room temperature for 4 hr. and used to
transform E. coli. Plating of transformation mixtures,
plaque selection, phage growth and preparation of
single-stranded DNA templates for sequencing were as described
(10). To avoid redundancy, templates were sorted by
single track analysis (18) prior to complete sequencing.
Selected DNA templates were sequenced by the
dideoxynucleotide chain termination method (11) using a
15 nucleotide long synthetic primer (10). Sequencing
reactions (5 µl, 30.0 ng template DNA) were terminated by the
addition of 10 µl of 98 percent deionized formamide, 10 mM
EDTA, 0.2 percent bromophenol blue and 0.2 percent xylene
cyanol. Terminated reaction mixtures were heated for 3 min.
at 100°C and 1 µl aliquots were electrophoresed on 40 cm long
5 percent polyacrylamide - 8M urea "thin" gels (19) for 2
to 6 hr. at 30 mA/1.8 kV. Gels were then transferred onto
Whatman 3 MM paper, vacuum-dried, and exposed to X-ray
film for an average of 12 hrs.

Function and organization of sequences within genomic
fragments

The functional organization of the highly homologous
sequences within the three fragments is identical
throughout the first 2,200 base pairs. Each fragment contains
one gene intact with regard to all currently known
sequence features (see below), and several hundred base
pairs of non-transcribed sequences flanking the genes.
Approximately 470 base pairs from the 5' end of each
fragment is a TATAAA sequence characteristic of
eukaryotic promoters (20). The beginning of exon I
(cap site) was tentatively assigned to an A residue 30
nucleotides downstream of this regulatory sequence. This
location of the cap site is in good agreement with that of
many eukaryotic genes (21-25) and precedes the Bam HI site
present in all three fragments by one nucleotide. Each
gene has the potential to code for a protein of 217 amino
acids, the first 26 constituting a signal peptide. The
coding sections are interrupted four times at identical
locations by small introns, between 260 base pairs (intron
A) and 90 base pairs (C) in size. All intervening
sequences start with a GT dinucleotide and end with an
AG, analogous to other reported splice sequences (26).
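
The GT...AG observation lends itself to a mechanical check
once intron boundaries have been assigned. The Python
sketch below is illustrative only: the function name, the
toy gene and its intron coordinates are assumptions, and
the check covers only the terminal dinucleotides, not the
longer consensus sequences at splice junctions.

```python
def follows_gt_ag_rule(gene: str, introns: list[tuple[int, int]]) -> bool:
    """True if every intron, given as a 0-based half-open (start, end) span
    on the coding strand, begins with GT and ends with AG."""
    gene = gene.upper()
    return all(
        gene[start:start + 2] == "GT" and gene[end - 2:end] == "AG"
        for start, end in introns
    )


# Hypothetical gene: exon + intron (GT...AG) + exon
toy_gene = "ATGGCC" + "GTAAGTTTTCAG" + "GAATAA"
print(follows_gt_ag_rule(toy_gene, [(6, 18)]))  # True
print(follows_gt_ag_rule(toy_gene, [(5, 18)]))  # False: boundary misplaced by one base
```

A boundary misplaced by even one nucleotide fails the check,
which is why the rule is useful in confirming a
segmentation derived by comparison to cDNA.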



The first exon (exon I) contains approximately 60
nucleotides of 5'-untranslated sequence, the first 3 codons
of the signal peptide and the first nucleotide of the 4th
codon. The second exon starts with the two remaining
bases for the fourth codon and carries the rest of the
coding region for the signal peptide together with 31 codons
for the mature protein. The third exon consists of 40
triplets (for amino acids 32 to 71), the fourth exon of
55 (72 to 126) and the fifth of 65 (127 to 191). The last
exon extends past the translational termination signal by
approximately 100 nucleotides and features the AATAAA
sequence common to most polyadenylated mRNAs (27), about
20 bases upstream of the presumed transcriptional termination
site.
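
The codon counts stated for the individual exons can be
tallied against the 217-residue precursor. The short Python
sketch below does only that arithmetic; the variable and
function names are assumptions introduced for illustration.

```python
SIGNAL_PEPTIDE_CODONS = 26  # spread over exon I and the start of exon II

# Codons for the mature protein carried by each exon, as stated in the text.
MATURE_CODONS_BY_EXON = {"II": 31, "III": 40, "IV": 55, "V": 65}


def residue_ranges(codons_by_exon: dict[str, int], first_residue: int = 1) -> dict[str, tuple[int, int]]:
    """Assign consecutive, inclusive residue ranges to exons in order."""
    ranges = {}
    start = first_residue
    for exon, count in codons_by_exon.items():
        ranges[exon] = (start, start + count - 1)
        start += count
    return ranges


ranges = residue_ranges(MATURE_CODONS_BY_EXON)
mature_length = sum(MATURE_CODONS_BY_EXON.values())
precursor_length = SIGNAL_PEPTIDE_CODONS + mature_length

print(ranges)            # {'II': (1, 31), 'III': (32, 71), 'IV': (72, 126), 'V': (127, 191)}
print(mature_length)     # 191
print(precursor_length)  # 217
```

The 65 codons of the fifth exon thus end at residue 191,
and the mature protein of 191 residues together with the
26-residue signal peptide accounts for the full 217-residue
primary translation product.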



The 3'-flanking regions show close homology for about
100 nucleotides, at which point both 2.6 kb EcoRI
fragments diverge in sequence from the 2.9 kb DNA.
They contain a block of middle repetitive sequences
(nucleotides 1732-2005) as evidenced by the intense smear
on a Southern blot when these regions are used to probe
restricted human DNA. Comparison to a consensus sequence
(28) reveals that these sequences are 270 base pair long
members of the human Alu family suggested to function as
origins in DNA replication (29). The Alu family
sequences are known to be transcribed by RNA polymerase
III (30) and are inserted such that transcription would
be from the opposite strand to that of the HGH genes.
A detailed comparison of these regions with those similar
in other human genes is presented elsewhere (31).

The remainder of the sequences on the 2.6 kb fragments
have unknown function. The function of the sequences
that constitute the last 700 nucleotides of the 2.9 kb
fragment is also not known. Probing human genomic DNA
with these latter sequences shows that they occur in the
2.9 and 9.5 kb genomic EcoRI fragments but not in the 2.6
kb fragments. EcoRI fragments of all three size classes
hybridize when the whole 2.9 kb DNA is used as a probe,
indicating the absence of repetitive sequences in this
DNA.

The genes and their products 

All three non-allelic fragments contain functional genes
as judged by the presence of promoters and polyadenylation
signals, correct intron-exon junctions and by the absence
of codon aberrations in exons (e.g. deletions, insertions,
additional stop codons) which would lead to truncated
translational products. In expression, hnRNAs of
approximately 1,650 bases are produced that can be
processed to 800 nucleotide long mRNAs (not counting
the 3'-polyA tail), containing 60 and 100 bases of 5'-
and 3'-untranslated regions respectively. The primary
translation products of these mRNAs are proteins of 217
amino acids including an N-terminal signal sequence of
26 residues.
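
The approximate figures just given are mutually consistent,
as the following Python sketch of the arithmetic indicates.
It assumes that the roughly 100 bases of 3'-untranslated
sequence are counted after the stop codon; the function
name is an assumption, and all results are approximations
because the inputs are themselves approximate.

```python
def mrna_length(utr5: int, codons: int, utr3: int, stop_codon: int = 3) -> int:
    """Length of an mRNA (without poly-A tail) from its component parts."""
    return utr5 + 3 * codons + stop_codon + utr3


estimated_mrna = mrna_length(utr5=60, codons=217, utr3=100)
estimated_intron_total = 1650 - estimated_mrna  # hnRNA length minus mRNA length

print(estimated_mrna)          # 814, i.e. roughly 800 nucleotides
print(estimated_intron_total)  # 836 bases removed by splicing
```

Some 836 bases shared among four introns is in keeping with
intron sizes lying between 90 and 260 base pairs.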

The fact that the exons contained on the first sequence
in Figure 2 are nucleotide for nucleotide the same as in
cloned HGH cDNA (4,7) suggests that the corresponding
genomic fragment contains the human growth hormone
gene expressed in the pituitary to produce somatotropin.
This contention is corroborated by the finding that no
HGH is produced in individuals afflicted with a deletion
in chromosome 17 which spans a 2.6 kb Eco RI fragment
bearing identical map characteristics. The non-allelic
EcoRI fragment containing two Bam HI sites is not affected
by this deletion and the presence of the gene cannot
restore growth (32). As shown in Figure 5, the protein
product of this gene differs from preHGH by 15 amino
acids. Two of the differences occur in the signal
peptide and are conservative in nature. It is important
to note that many of the amino acid changes in the mature
part of the protein are non-conservative and are expected
to change the properties of the protein considerably.
Such changes occur at positions 18 (His -> Arg), 21 (His
-> Tyr), 65 (Gln -> Val), 66 (Glu -> Lys), 112 (Asp ->
Arg), 113 (Leu -> His), 126 (Gly -> Trp), 140 (Lys -> Asn),
and 149 (Asn -> Lys). Thus, this protein has lost two
acidic amino acids and gained three basic ones over HGH,
leading to an increase in isoelectric point from 5.5 for HGH
to 8.9 for the variant.
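
The statement that two acidic residues are lost and three
basic ones gained can be checked by tallying the
substitutions. The Python sketch below is illustrative
only: the substitution list is as read from the positions
given above, and it assumes that only Lys and Arg are
counted as basic (His being treated as uncharged), which is
the counting under which the stated figures are obtained.

```python
# (position, residue in HGH, residue in the variant), as read from the text.
SUBSTITUTIONS = [
    (18, "His", "Arg"),
    (21, "His", "Tyr"),
    (65, "Gln", "Val"),
    (66, "Glu", "Lys"),
    (112, "Asp", "Arg"),
    (113, "Leu", "His"),
    (126, "Gly", "Trp"),
    (140, "Lys", "Asn"),
    (149, "Asn", "Lys"),
]

ACIDIC = {"Asp", "Glu"}
BASIC = {"Lys", "Arg"}  # assumption: His is not counted as basic


def net_change(substitutions, group):
    """Net gain (+) or loss (-) of residues belonging to the given group."""
    return sum((new in group) - (old in group) for _, old, new in substitutions)


acidic_change = net_change(SUBSTITUTIONS, ACIDIC)
basic_change = net_change(SUBSTITUTIONS, BASIC)
print(acidic_change)  # -2
print(basic_change)   # 3
```

Both changes shift the net charge in the same direction,
in keeping with the higher isoelectric point reported for
the variant.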

These changes result in a 20-fold lower cross-reactivity
to HGH antibodies (33) and are expected to lead to a
considerable increase in isoelectric point. Curiously,
its receptor binding efficiency seems to be comparable
to that of HGH (33), making the variant a possible
competitive inhibitor of HGH action. The latter results were
obtained by characterizing the protein produced from the
variant gene employing an SV40 expression system.
Although transcription of the gene was controlled primarily
by a viral promoter (33), the fact that the 5'-flanking
sequences of both the HGH-V and the HGH-N gene contain a
functional promoter (34) strongly suggests that the HGH-V
gene is a functional gene expressed in vivo.

It is not known whether the variant protein is actually
produced in vivo and if so, in which tissue. Although
the pituitary is a probable production site for the
protein, one should bear in mind that the known tissue
specificity of HGH and HCS gene expression could hold for
other members of the HGH gene family, which may be
synthesized elsewhere in the body.

Different from HGH and HCS, and also from all the known
animal growth hormones (35), the HGH variant features a
second tryptophan residue. In addition to the amino
acid changes, 17 nucleotide differences result in
synonymous codons and 2 and 4 changes occur in the 5' and
3' untranslated regions respectively.

It is of interest to point out that the HGH-V gene may
also code for a protein 15 residues smaller than the
191 amino acid long product described above. This is in
formal analogy to the biosynthesis of HGH where
approximately 10 percent of the pituitary-produced hormone
consists of a shorter version of HGH ("20 K variant")
which is missing amino acid residues 31-45 (36). The
hypothesis that HGH and its deletion variant are
generated by different splicing events of the same
primary transcript was proposed by Wallis (37), and
recent data seem to substantiate this notion (25).
The coding sequences for residues 31 to 45 lie at the
beginning of exon III and are identical in the HGH-N
and HGH-V genes. Since both sequences carry a canonical
splice site (nucleotides 730-745, see Figure 2), the
primary transcript of the HGH-V gene could be spliced inside
of exon III, resulting in a shorter mRNA similar to the one
which codes for the HGH deletion variant. Although the
HGH-V gene has not been documented to be expressed in
its intact nature, its chromosomal location within that
of the HGH gene family and the fact that it can be
expressed in vitro (33) suggest that it is functional.

Possible regulatory sequences

The expression of the HGH gene is controlled by
glucocorticoids and thyroid hormones. In cultured rat
pituitary cells both types of hormones have a synergistic
effect on the production of rat GH and its mRNA (38,39).
Such hormone action is mediated by receptor proteins which
are thought to interact with specific DNA sequences located
in the vicinity of responsive genes (38,39). Recently,
hormonally responsive transcription of the HGH-N gene and
HGH synthesis could be demonstrated in murine fibroblasts
transformed with the respective human genomic 2.6 kb
Eco RI fragment (34). The sequences involved in the
hormonal induction are contained within 500 base pairs
of 5'-flanking region, as shown by fusing the corresponding
section from the 2.6 kb DNA to the thymidine kinase gene
and thereby rendering this gene responsive to
dexamethasone (34).

According to current models this region provides sequence
elements for the specific binding of glucocorticoid
receptor-hormone complex(es) (40). Although such specific
interaction could be demonstrated with MMTV DNA and
purified receptor protein (41), the respective binding
site(s) has not yet been analyzed, leaving the nature
of the relevant sequences open to speculation. Thus,
any of the prominent features that occur in the 5'-flanking
sequence of the HGH-N gene could play a role in receptor
recognition. These features consist of purine-rich
nucleotide stretches and palindromic structures.

The Goldberg-Hogness box, for example, lies near the end
of a stretch of 62 nucleotides (-81 to -20, Figure 2) of
which only 14 are pyrimidines. Such an uneven distribution
of purines and pyrimidines can cause helix destabilization
and may facilitate the local melting of DNA strands
possibly involved in hormone-mediated transcriptional
induction. This region spans the location of the CAT
box found in other genes (21), but no good fit with a
consensus sequence is detectable at an appropriate
distance from the TATA box. Since the sequence in this
position is thought to be involved in the rate of
transcriptional initiation (42,43), lack of homology
to other systems may reflect a special type of
transcriptional regulation of the HGH-N gene.
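
The composition figures quoted for the stretch from -81 to -20 can be
checked, and comparable purine-rich stretches located in any sequence,
along the following lines. The function names, the window size and the
threshold are hypothetical, and the test sequence is invented for
illustration.

```python
PURINES = set("AG")


def span_length(start, end):
    """Nucleotides in an inclusive span; both ends on the same side of +1."""
    return abs(end - start) + 1


def purine_fraction(seq):
    """Fraction of A and G in a DNA sequence."""
    seq = seq.upper()
    return sum(base in PURINES for base in seq) / len(seq)


def purine_rich_windows(seq, window, min_fraction):
    """0-based offsets of windows whose purine fraction meets the threshold."""
    return [
        i
        for i in range(len(seq) - window + 1)
        if purine_fraction(seq[i : i + window]) >= min_fraction
    ]


# Stretch from -81 to -20 with 14 pyrimidines, as stated in the text.
stretch = span_length(-81, -20)
stretch_purine_fraction = (stretch - 14) / stretch

# Invented sequence, for illustration only.
offsets = purine_rich_windows("CCAGGAAGCC", window=4, min_fraction=1.0)
```

On the stated figures the stretch is a little over three-quarters
purine.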

An extended region with palindromes and inverted repeats
is found between nucleotides -304 and -198. Towards the
middle lie two imperfect inverted repeat sequences of 15
(-278 to -264) and 17 base pairs (-238 to -222) which are
separated by 25 base pairs containing a section very
rich in purines. Each inverted repeat is composed of
two parts, 6 and 7 bases long with perfect homologies in
their counterparts, but separated in one repeat by 2
nucleotides and in the other by 4. Three palindromes
occur in the vicinity (-290 to -285, -265 to -260, and
-213 to -205) and two of them overlap with 15-nucleotide-long
imperfect inverted repeats (-304 to -290, and -198
to -212) located at the beginning and end of the whole
region.

Approximately 80 base pairs upstream is another highly
purine-rich sequence (-371 to -358) where 31 out of 34
nucleotides are purines. In the middle of this region
one finds a repeat of the sequence GGATAG, of which a
single copy is found in the complementary strand 38
base pairs downstream (-321 to -328). Sequence elements
that display dyad symmetry, such as palindromic structures
and inverted repeats, are possible candidates for
interaction with hormone receptors, since such DNA
structures are known to be involved in the regulation of
procaryotic gene expression (44).
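
Dyad symmetry of the kind discussed above can be tested for as follows:
a palindrome reads the same as its own reverse complement, and a copy
of a motif on the complementary strand appears as the reverse complement
of that motif. The function names are hypothetical and the scanned
sequence is invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the complementary strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq):
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def find_palindromes(seq, length):
    """0-based offsets of palindromic windows of the given length."""
    return [
        i
        for i in range(len(seq) - length + 1)
        if is_palindrome(seq[i : i + length])
    ]


# GGATAG on one strand reads CTATCC on the complementary strand.
motif_on_other_strand = reverse_complement("GGATAG")

# Invented sequence containing one 6-base palindrome at offset 2.
hits = find_palindromes("TTGAATTCAA", 6)
```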



It is worth noting that purine-rich regions are also
found in the introns of the HGH gene, and that some of
them show homology to small regions of other
hormone-responsive sequences (e.g. MMTV (45), rat GH genes (24)
and mouse metallothionein gene (22)). Whether such
homology is fortuitous or related to hormone-responsive
gene expression remains to be elucidated.
Expression of gene in cell culture

The following description defines the means and methods
for isolating HGH-V mRNA via an expression vector,
pSV-HGH-V. The 342 base pair Hind III-Pvu II fragment
encompassing the SV40 origin was converted to an Eco RI
restriction site bound fragment. The Hind III site was
converted by the addition of a synthetic oligomer
(5'dAGCTGAATTC) and the Pvu II site was converted by
blunt-end ligation into an Eco RI site filled in using
Polymerase I (Klenow fragment). The resulting Eco RI
fragment was inserted into the Eco RI site of pML-1 (2).
A plasmid with the SV40 late promoter oriented away
from the ampR gene was further modified by removing the
Eco RI site nearest the ampR gene of pML-1 (46).

The 1023 base pair Hpa I-Bgl II fragment of cloned HBV DNA
(47) was isolated and the Hpa I site of hepatitis B virus
(HBV) converted to an Eco RI site with a synthetic
oligomer (5'dGCGAATTCGC). This Eco RI-Bgl II bounded
fragment was directly cloned into the Eco RI-Bam HI sites
of the plasmid described above carrying the origin of
SV40.
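
Both synthetic oligomers named above carry the Eco RI recognition
sequence GAATTC, which is what allows them to convert the Hind III and
Hpa I sites. A minimal check is sketched below; the function name and
the dictionary labels are hypothetical, and only the two oligomer
sequences are taken from the text.

```python
ECORI_SITE = "GAATTC"


def find_sites(seq, site=ECORI_SITE):
    """0-based offsets at which the recognition sequence occurs."""
    seq = seq.upper()
    return [
        i for i in range(len(seq) - len(site) + 1) if seq[i : i + len(site)] == site
    ]


# Oligomer sequences as given in the description (5' to 3').
linkers = {
    "Hind III conversion": "AGCTGAATTC",
    "Hpa I conversion": "GCGAATTCGC",
}

site_positions = {name: find_sites(seq) for name, seq in linkers.items()}
```

The same scan can be run over a candidate insert to confirm that it
carries no internal Eco RI site before it is cloned on Eco RI ends.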

Into the remaining Eco RI site was inserted the HGH-V gene
on a 1250 base pair Pst I fragment of p69 after conversion
of the Pst I ends to Eco RI ends. Clones were isolated
in which the SV40 late promoter preceded the structural
gene of HGH-V. The resulting plasmids were then
introduced into tissue culture cells (Gluzman et al.,
Cold Spring Harbor Symp. Quant. Biol. 44, 293 (1980))
using a DEAE-dextran technique (48) modified such
that the transfection in the presence of DEAE-dextran was
carried out for 8 hours. Cell media was changed every
2-3 days. 200 microliters was removed daily for
bioassay. Typical yields were 300-500 ng/ml on samples
assayed three or four days after transfection.
DNA encoding human growth hormone variants can be
constructed for use in expression of protein in cell
culture by using chemically synthesized DNA in
conjunction with enzymatically synthesized DNA. The
hybrid DNA encoding the heterologous polypeptide is
provided in substantial portion, preferably a majority,
via reverse transcription of mRNA, while the remainder
is provided via chemical synthesis. In a preferred
embodiment, synthetic DNA encoding the first 24 amino
acids of human growth hormone variant (HGH-V) is
constructed according to a plan which incorporates an
endonuclease restriction site in the DNA corresponding
to HGH-V amino acids 23 and 24. This is done to
facilitate a connection with downstream HGH-V cDNA
sequences. The various oligonucleotide fragments
making up the synthetic part of the DNA are chosen
following known criteria for gene synthesis: avoidance
of undue complementarity of the fragments, one with
another, except, of course, those destined to occupy
opposing sections of the double stranded sequence;
avoidance of AT rich regions to minimize transcription
termination; and choice of microbially preferred codons.
Following synthesis, the fragments are permitted to
effect complementary hydrogen bonding and are ligated
according to methods known per se.
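
One of the design criteria listed above, the avoidance of AT rich
regions, lends itself to a simple screen of candidate fragments. The
sketch below is illustrative only: the window size, the threshold and
the function names are assumptions, not values taken from the
description, and the fragment shown is invented.

```python
def at_fraction(seq):
    """Fraction of A and T in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def at_rich_windows(seq, window=8, threshold=0.75):
    """0-based offsets of windows whose A+T fraction meets the threshold.

    The default window and threshold are arbitrary illustrative choices.
    """
    return [
        i
        for i in range(len(seq) - window + 1)
        if at_fraction(seq[i : i + window]) >= threshold
    ]


# Invented fragment with an eight-base run of A and T in the middle.
candidate = "GCGCATATATATGCGC"
flagged = at_rich_windows(candidate, window=8, threshold=1.0)
```

A fragment with flagged windows would be a candidate for redesign using
alternative codons for the same amino acids.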



The greater portion of the DNA coding sequence can be
provided as described above from genomic DNA. This
portion encodes the C-terminal of the polypeptide and
is ligated, in accordance herewith, to the remainder of
the coding sequence, obtained by chemical synthesis,
optionally including properly positioned translational
start and stop signals and upstream DNA through the
ribosome binding site and the first nucleotide (+1)
of the resultant messenger RNA. The synthetic fragment
can be designed by nucleotide choice dependent on
conformation of the corresponding messenger RNA in order
to avoid secondary structure imposed limitations
on translation.
CLAIMS:

1. A human growth hormone variant protein differing in
structure from natural human growth hormone.

2. The human growth hormone variant protein having the
amino acid sequence set forth in Figure 3 hereof.

3. The human growth hormone variant protein according
to claim 2 including the presequence thereof.

4. A method of obtaining cDNA encoding a desired
polypeptide which comprises the steps of:

a) probing genomic DNA to obtain genomic sections
containing gene for said polypeptide;
b) incorporating the genomic sections of step a)
into host cells and permitting transcription
of same into corresponding mRNA, and
c) isolating the cDNA for said polypeptide by
creating a cDNA bank from the mRNA of step b)
and probing for the requisite DNA sequence of
said polypeptide.

5. The method of claim 4 useful for producing HGH-V
cDNA.

6. The method according to claim 4 wherein the genomic
sections of step a) are integrated into permissive Cos
cell vectors.

7. The DNA sequence encoding a human growth hormone
variant protein having the sequence set forth in Figure 3
hereof.

8. The DNA sequence according to claim 7 including the
presequence thereof.

9. A replicable cloning vehicle containing the DNA
sequence of claim 7.

10. An expression vector comprising a DNA sequence
according to claim 7 operably linked to expression
effecting DNA sequence and flanked by translational
start and stop signals.

11. A viable cell culture transformed with the
expression vector of claim 10.

12. A cell culture capable of producing the human
growth hormone variant according to claim 2.
-450 -400
hCH-H CAATTCAGGAClCAATCGTGCICACAACCCCCACAATCTArTCGClCTGCTT-CGCCCCTTlTCCCAACACACACATTCTCTCTCGT&GCTCGAGGTTAAACATCCGO^^ 
MCH-V GAAlTCAGCACTGAAlCATGCCCAGAACCCCCGCAAICTATTGGCTGTGCTTTGGCCCCUTICCCAACACACACATTCTGICIGGTGCliTGGAGGCC^^ 
HCS GAATTCAUGACTCAAlGGTGCTCAGAACCCCCACAAlCTAITGGCTbTGCTT-GGCCCCTTrTCCCAACACACACAriCTC 

-350 -300
MCM.H GCAGGAAAGGGATAGGATAGAGAATCGoATGTCGICCGTAGGGCCTCTCAAGGACTGGCCTATCCTGACATCCTTCCCCCGCCTGCAGGTTGGCCACCATGOCCTGCGGCC 
HGH-V OCACGAAAGUAATAGGATAGAGAGTGGGATGGGGTCGGTAGGCG-TCTCAAbGACTCGCCTATCCTGACATCCTTCTCC-GCCTTCAGCTT&GCCACUTGCCCTGCTGCC 
HCS CGAGGAAAGGAATAGGATACAGAGTGGAAlGGGGICGGTAGGGG-TCTCAAGGACTGGC.TATCCTGACAGCCrTCCCC-GCGTTCAGGTTGACCAACA 

-250 -200 
HGH-H AGAGGGCACCCACGT6ACCCTTAAAGAGAGGACAAGTTGGG7GGTATCTCTGGCT6ACACTCT6TGCACAACCCTCACAACACTGGTGACGGTGGGAACGGAAACATGACA 
HCM-V ACAGGGCACCCACGTGACCCTTAAAGAGAGGAeAAGTTGGGTGGTATCTCTGGCTGACATTCTGTGCACAACCCTCACAACGCTGGIGATGGTGGGAAGGGAAAGATGACA 
HCS AGAGCGCACCCACCTGACCCTCAAAGAGAGGACAAGnGGGTGGAGTCTGlCGCTGACACTCTGTGCACAATCCTTACAACAICGGTGATGGTGAGAAGGGAAAGACoACA 

-150 -100
HGH-H ACCCACGCCGCATGATCCCAGCATGTGTGGGAGGAGCTTCTAAATTATCCATTAGCACAAGCCCGTCAGTGGCCCCATGCATAAATGTACCACAGAAACAGGTGGGGTCAA 
HGM-V AGTCAGGCGGCATGATCCaGCATGICTGGGAGGAGCTTCTAAATTAlCCATTAGCACAAGCCCGTCAGTGGCCCCACGCCTAAACAT-GUGAGAAACAGGTGACGAGAA 
HCS AGCCAGGGCGUTGArCCCAGUTGT6TGGGJ^G6AGCTTCCAAATTATCCATTA6CACAA6CCCGTCAGTGGCCCCATGCATAAATGTA-CACAGAA>iCAG&IGGK 

-30 -1 
HGH-H -CAGTGGGACAOAA— GGGCCCACGGTATAAAAAGGGCCCACAAGAGACCAGCTCA 
H6H-V CCAGCCAGAGAGAA— GGGGCCAGG-TATAAAAAGGGCCCACAACAGACCAGCrCA 
HCS 6CAGGGAGAGAGAAC T-GGGCC AGGGTATAAAAA6GGCC C ACAA6AG ACCGGC TC T 

EXON I

60 

HGH-H AGGATCCCAAGGCCCAACTCCCCGAACCACTCAGCGTCCTGTGGACAGCTCACCTAGCTGCA ATG GCT ACA G 
HCH-V AC&ATCCCAAGGCCCAACTCCCC6AACCACTCAGGGTCCTGTG6ACA6CTCAC-TAGCG6CA ATG GCf GCA G 
HCS AGGATCCCAAGGCCCAACTCCCCGAACCACTCAGGGTCCTGTGGACAGCTCACCTAGTGGCA ATG GCT GCA G 

-25 

INTRON A

100 

HCH-H GTAACCGCCCCTAAAATCCCTTTGGCACAATGTGTCCTGAGGGGAGAGGCAGCGACCTGTAGATGGGACGGGGGCACTAACCCTCAGGGTTTGGGG-TTCTGAATGTGAG- 
HGH-V 6TAA6CGCCCCTAAAATCCCTTTGGCACAATGTGTCCTGAGGGCAGAGGCG6CGTCCTGTAGATCGGACGGCGGCACTAACCCTCA66-TTTC6GGCTTATGAATGTTAGC 
HCS GTAAGCGCCCCTAAAATCCCTTTGGCACAACGTGTCCTiiAGGGGAGAGGCAGCGCCCTGrAGATGGGACGGGGGUCTAACCCTCAGG-TrTGGGGCTIATGAAlGTaAG- 

200 

HCH-H TATCGCCATGTAAGCCCAG-TATTTGGCCAATCTCAGAAAGCTCGTGGCTCCCTGGAGC-ATGG AGA6AGAAAAACAAA CAGCTCCTC6AGCAGGGA 

HGH-V TATCGCCATCTAAGCCCAG-TATTT66CCAATCTCT6AATGTTCCT6G-TCCCTGGAGG-A-GGCAGAGAGAGAGAGAGAGAAAAAAAAAACCCAGCTCCTGGAACAGGGA 
HCS TATC6CCATCTAAGGCCAGATATTTC6CCAATCTCTGAAI6TTCCTG6-TCTCTGGAGGGATGG AGAGAGAGAAAAAAACAAA UGCTCCTGGAGCaGGGA 

300 

HGH-H GAGTGTTGCCCTCTTGCTCTCCGGCTCCCTCTGTTGCCCTCTGGnTCTCCCCAG 
HGH-V GAGCGCTGGCCTCTTGCTCTCCAGCTCCCTCTGTTGCC-TCCGGTTTCTCCCCAG 
HCS GAGCGCTGGCCTCTTCCTCTCCGGCTCCCTCaTTGCC-TCCGGTTTCTCCCCAG 

EXON II

400 

HGH-H CC TCC CGG AC6 TCC CT6 CTC CT6 GCT TTT GGC CT6 CTC TGC CTG CCC TGG CTT CAA GAG GGC AGT GCC TTC CCA ACC ATT CCC 
HGH-V CC TCC CGG ACG TCC CTG CTC CTG GCT TH GGC CTG CTC TGC CTG TCC TGG CTT CAA GAG GGC AGT GCC TTC CCA ACC ATT CCC 
HCS SC TCC CGG ACG TCC CTG CTC CTG GCT TTT GCC CTG CTC TGC CTG CCC TGG CTT CAA GAG GCT GGT GCC GTC CAA ACC GTT CCG 
-20 -1 1 



HA tec AGG CTT TTT GAC AAC GCT ATG CTC CGC GCC CAT CCT CTG CAC CAG CTC GCC TTT GAC ACC TAC UG GAG TTT 

nwt-f HA TCC AGG CTT TTT GAC AAC GCT ATG CTC CGC GCC CGT CGC CTG TAC CAG CTG GCA TAT GAC ACC TAT CAG GAG TTT 

HCS HA TCC AGG CTT TTT GAC CAC GCT ATG CTC CAA GCC CAT CGC CCG CAC CAG CTG GCC ATT GAC ACC TAC CAG GAG TTT 

20 



HGH 
HGH-V 



INTRON B

500 600 
HW-H GTAAGCTCTTCGGGAATGCGTGCGUTCAGGCCTCGCAGGAACCCGTGACTTTCCCCCGCTGGAAA-TAAGAGGAGGAGACTAAGGAGCTaGGCTTTTTCCCGACCCCGA 
MGH-V CTAAGCTCTTGGGTAATGGGTGCGCTTCAGAGGTGGCAGGAAGGGCTGAATTTCCCCCGCTCGGAACTAATGGGAGGACACTAAGGAGCTCACGGTT 
HCS GTAACTTCTTGGGGAATCGGTGCGGGTCAGGGGTGGCAAGAACGGGTGACnTCCCCCACTGtMG-TAATGGGAGGAGACTAAGGAGCTCAC^ 

HGK-H AAATGCAGGCAGAT6AGCACAC6CTGAGCTAGGTTCCCAGAAAAGTAA-AAT6GGA6CA66TCTC-A6CTCA6A CCTTGGTGG6CGGTCCTTCTCCTAG 

KCH-V AAATCGAGGCAGATGAGCATACGCTGAGTGACGTTCCCAGAAAAGTAACAATGGGAGCACGTCTCCACaTAGA CCTTGGTGGGCGGTCCTTCTCCTAG 

HCS AAATCaCGCAGATGAGCATAGGCTGAGCCAGGTTCCCAGAAAAGCAACAAT&CGACCTGGTCTCCAGCATAGAAACCAGCAGTC 

EXON III

700 

HGH-H CAA CAA GCC TAT ATC CCA AAC CAA CAG AAC TAT TCA JIC CTG CAG AAC CCC CAG ACC TCC CTC T6T TTC TCA CAG TCT ATT CCG 
HGH-V CAA CAA GCC TAT ATC CTG AAC GAG CAG AAC TAT TCA TTC CTG CAG AAC CCC CAC ACC TCC'CTC TGC TTC TCA GAG TCT ATT CCA 
m CAA CAA ACC TAT ATC CCA AAC CAC CAG AAC TAT TU TTC CTG CAT GAC TCC CAC ACC TCC TTC TGC TTC TCA OAC TCT ATT CCG 

40 

800 

ACA CCC TCC AAC AGG GAG CAA ACA CAA CAC AAA TCC P I H 0 rt 

, ACA CCT TCC AAC ACC CTC AAA ACC CAG CAC AAA TCT 1 I U . il U 

HCS ACA CCC TCC AAC ATG CAC GAA ACG CAA CAG AAA TCC 
60 



HOt-H 
NCK^V 



INTRON C 

900 

KGH-M CTCACTGGATGCCTTCTCCCCACGCCGCCATCCGGCACACCTCTAGTCAGAGCCCCCGCGCAGCACACCCAATGCCCGTCCTTaCCCTCaG 
HCH-V CTCACTCGATaCTTCTCCCCAGCTCGC-ATCGGGTACACCTCTCCTCAGAGCCCCCGGaACCACAGCCACTCCCGGTCCTT-CCCCTGaG 
HCS CTGAGTCGAtTCCGTCTCCCTAGGCGCGGATCCCCCACACCTCTGGTCACGGCTCCCCCCCACCACACCCACFGCCGGTCCTT-CCCCTCCAG 
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hGH-M 
HCS 



HGH-H 
HCS 



HGH-H 
HCH-V 
HCS 



^ CU CTC CCC AIC KC CTO CTO CJC MC CAC TCC CfC C.C CCC CTG CAC TIC CK JCC «T UTC IIC 

Za' Clc CK c" MC KC clc 'cK CK MC 'cio KC CK UC CCC OK CGG TK CK AOG ACT A,0 .TC UCC AAC 

,CC i^OK TAC CCC CCC KT CAC AOC AAC CK TAT CAC CK CU AAC CAC CTA CAC UA CCC AK CAA ACC CK aK ^C 
'cK CK m «C KC '^1^ «T 'cIJ m cfc 'c't'a AAC CAC CTA CAC CAA CCC ATC CAA ACC CTC AK CCC 

100 



INTRON D



1100 



'clSMcfcSTac^Tccl^ 
t^^rVm a cMiic^cclcci^^^^^^^^^ 



KGH-N 

HCH-V — 

HCS AACtCACCTTATTCTTCATTTGCCCTCCl 



1300 

HCH-H CCTTGGCCTCTCCTTCTCTTCCTTCACTTTGCAG 
MGH-V GCTTCGCCTCTCCTTCTCTTCCTTCACTTTGCAG 
HCS GCTTGGCCTCTCCTTCTCTTCCTTCACTTTGCAG 

EXON V



1400 



HCH-K 
HGH-V 
HCS 



HGH-H 
HGH-V 
HCS 



H6H-N 
HGH-V 
HCS 



HGH-H 
HGH-V 
HCS 



rrr raT rrr irr CCC frc ACT GGG CAG ATC TTC AAC CA6 ACC TAC A6C AAG TTC GAC AW AAC TCA CAC AAC GAT 6AC 
^ ^.T rrr ^1 CCC CGG fcT 6GG CAG ATC TTC AAT CAG TCC 7AC AGC AAG TTT GAC ACA AAA TCG CAC AAC GAT GAC 
III ^ "I ^'c JgI CGC CGG k] UG MC JtC AAG CAG ACC TAC AGC AAG TTT GAC ACA AAC TCA CAC AAC UT GAC 

140 

CCA CTA CTC AAC AAC TAC ^ CTC CTC TA^ TOC TK ACC AAC CAC ATC C« AAC CK CAC ACA TK CK CC^ ATC CTC CAC TCC 
^C^ l]l 1^ I« ^ l^l CK J« nC ice ^ ^ AT^ CAC AAC CTC CAC ACA TK CK CCC AK CK CAC TCC 



uo 



1500 

CGC TCT GTG GAG GGC AGC TGT 6GC TTC 
CGC TCT GTG GAG GGC AGC TCT GGC TTC 
CGC TCT 6TA GAG 66T AGC TGT GGC TTC 

191 
1600 

TCCAGTGCCCACCA6CCTT6TCCTAATAAAATTAAGTTGCATC 
TCCAGTGCCCACCA6CCTT6TCCTAATAAAATTAAGTTGCATC 
TCCA&TGCCCATCAGCCTTGTCCTAATAAMTTAACTTGTATCATTTC 



TAGCTGCCCGGGTGGCATCCCT6T GACCCGTCCCCAGTGCCTCTCCTCGCCCTGOAAGTTGCCAC 

UKTGCCCGGGTGGW GACCttTCCCCAGTCCCTCTCCTGGTCGTGGAAGGTCCTAC 

ISStcgScw 



3*-H0HTRAHSailB£0 j^QQ 

•* 1600 — — — — — ^— — 

gcaSccaagckc^^ 

a.TCACCACCCTCACCTAATT...CUMM.CU.AU^ 



HGH-H 
HGH-V 
HCS 



HGH-N 
HGH-V 
HCS 



HGH-H 
HOi-V 
4CS 



HCS 



HCS 



HCS 



HCS 



2X0 



4^ 

TATnCCCGufTAAATGCATGCCAACACTi^ACCTAaATAACCAACCT^^^^^ 

2400 

aCATCCAACICACACCCCTCCCAACCACCACAATCTCCAC^ 
ACCTTCTGAnCTnCAGACAGCCAGGAaTCCAGACTmACCACGAATTC 

FIG. 2 b 
T4 DNA ligase
Transform E. coli 294
Eco RI
COS-7 cells
DEAE-dextran

Harvest cells 4th day
post transfection

Isolate mRNA